A Framework for Sampling-Based XML Data Pricing
Abstract
While price and data quality should define the major tradeoff for consumers in data markets, prices are usually prescribed by vendors and data quality is not negotiable. In this paper we study a model where data quality can be traded for a discount. We focus on the case of XML documents and consider completeness as the quality dimension. In our setting, the data provider offers an XML document, sets the price of the document, and assigns a weight to each node of the document according to its potential worth. The data consumer proposes a price. If the proposed price is lower than that of the entire document, then the data consumer receives a sample, i.e., a random rooted subtree of the document whose selection depends on the discounted price and the weights of the nodes. By requesting several samples, the data consumer can iteratively explore the data in the document. We present a pseudo-polynomial time algorithm to select a rooted subtree with prescribed weight uniformly at random, but show that this problem is unfortunately intractable. Yet, we are able to identify several practical cases where our algorithm runs in polynomial time. The first case is uniform random sampling of a rooted subtree with prescribed size rather than prescribed weight; the second case restricts the weights to be binary. As a more challenging scenario for the sampling problem, we also study the uniform sampling of a rooted subtree of prescribed weight and prescribed height. We adapt our pseudo-polynomial time algorithm to this setting and identify tractable cases.
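The prescribed-size case mentioned in the abstract can be illustrated with a small counting-then-sampling sketch. This is not the paper's algorithm, only a minimal dynamic-programming illustration under stated assumptions: the tree is encoded as a dictionary from node to child list, a "rooted subtree" is taken to mean a connected set of nodes containing the root, and the names `build_counts`, `pick`, and `sample_rooted_subtree` are hypothetical. The idea is to count, for every node, how many rooted subtrees of each size exist below it, and then to walk down the tree choosing each child's share of the size budget with probability proportional to those counts.

```python
import random
from typing import Dict, List, Optional, Set

Tree = Dict[str, List[str]]  # assumed encoding: node -> ordered list of children


def build_counts(tree: Tree, v: str, k: int, rows: dict, cnt: dict) -> None:
    """Fill cnt[v][s] = number of rooted subtrees of v with exactly s nodes.

    cnt[v][0] = 1 encodes "v is left out". rows[v][i][j] is the number of
    ways to pick j nodes in total below v using only its first i children.
    """
    f = [1] + [0] * k
    rows[v] = [f]
    for c in tree.get(v, []):
        build_counts(tree, c, k, rows, cnt)
        g = cnt[c]
        # Convolution: split a budget of j nodes between earlier children and c.
        f = [sum(f[j - s] * g[s] for s in range(j + 1)) for j in range(k + 1)]
        rows[v].append(f)
    cnt[v] = [1] + f[:k]  # selecting v itself consumes one node


def pick(weights: List[int], rng: random.Random) -> int:
    """Return index s with probability weights[s] / sum(weights), exactly."""
    r = rng.randrange(sum(weights))
    for s, w in enumerate(weights):
        if r < w:
            return s
        r -= w
    raise AssertionError("unreachable when sum(weights) > 0")


def _sample(tree, v, size, rows, cnt, rng, out) -> None:
    out.add(v)
    j = size - 1  # budget left for the descendants of v
    children = tree.get(v, [])
    for i in range(len(children), 0, -1):
        c = children[i - 1]
        prev = rows[v][i - 1]
        s = pick([prev[j - s] * cnt[c][s] for s in range(j + 1)], rng)
        if s > 0:
            _sample(tree, c, s, rows, cnt, rng, out)
        j -= s


def sample_rooted_subtree(tree: Tree, root: str, k: int,
                          rng: random.Random) -> Optional[Set[str]]:
    """Uniformly sample a rooted subtree with exactly k nodes, or return None."""
    if k < 1:
        return None
    rows, cnt = {}, {}
    build_counts(tree, root, k, rows, cnt)
    if cnt[root][k] == 0:
        return None  # no rooted subtree of that size exists
    out: Set[str] = set()
    _sample(tree, root, k, rows, cnt, rng, out)
    return out
```

For example, with `tree = {"r": ["a", "b"], "a": ["c"]}`, calling `sample_rooted_subtree(tree, "r", 3, random.Random())` returns either `{"r", "a", "b"}` or `{"r", "a", "c"}`, each with equal probability. The tables have one column per unit of size, so the work is polynomial in the number of nodes. If each node instead consumed its own integer weight, the same tables would need one column per unit of target weight, which is consistent with the pseudo-polynomial bound the abstract describes for the weighted problem.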
Similar resources
A Differentiated Pricing Framework for Improving the Performance of the Elastic Traffics in Data Networks
Rate allocation has become a demanding task in data networks as diversity in users and traffics proliferate. Most commonly used algorithm in end hosts is TCP. This is a loss based scheme therefore it exhibits oscillatory behavior which reduces network performance. Moreover, since the price for all sessions is based on the aggregate throughput, losses that are caused by TCP affect other sessions...
Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML
As moving to big data world where data is increasing in unstructured way with high velocity, there is a need of data-store to store this bundle amount of data. Traditionally, relational databases are used which are now not compatible to handle this large amount of data, so it is needed to move on to non-relational data-stores. In the current study, we have proposed an extension of the Mongo...
Get a Sample for a Discount - Sampling-Based XML Data Pricing
While price and data quality should define the major tradeoff for consumers in data markets, prices are usually prescribed by vendors and data quality is not negotiable. In this paper we study a model where data quality can be traded for a discount. We focus on the case of XML documents and consider completeness as the quality dimension. In our setting, the data provider offers an XML document,...
Investigating the Role of real Money Balances in Households' Preferences function in the Framework of the Assets Pricing Models (M-CCAPM): Case study of Iran
In this paper, we try to develop and modify the basic model of the consumption-based capital asset pricing model by adding the growth in real money balances rate as a risk factor in the household's utility function as (M-CCAPM). For this purpose, two forms of utility function with constant relative risk aversion (CRRA) preferences and recursive preferences have been used such that M1 and M2 are...
Journal title:
- Trans. Large-Scale Data- and Knowledge-Centered Systems
Volume 24, Issue
Pages -
Publication date: 2016